(57) Abstract: An apparatus for efficiently performing spatial scalable compression of an input video stream is disclosed. A base

FIELD OF THE INVENTION

The invention relates to a video encoder, and more particularly to a video 
encoder which uses spatial scalable compression schemes to produce a plurality of base 
streams and a plurality of enhancement streams.

BACKGROUND OF THE INVENTION 

Because of the massive amounts of data inherent in digital video, the 
transmission of full-motion, high-definition digital video signals is a significant problem in 
the development of high-definition television. More particularly, each digital image frame is
a still image formed from an array of pixels according to the display resolution of a particular
system. As a result, the amounts of raw digital information included in high resolution video
sequences are massive. In order to reduce the amount of data that must be sent, compression
schemes are used to compress the data. Various video compression standards or processes
have been established, including MPEG-2, MPEG-4, H.263 and H.264.

Many applications are enabled where video is available at various resolutions
and/or qualities in one stream. Methods to accomplish this are loosely referred to as
scalability techniques. There are three axes on which one can deploy scalability. The first is
scalability on the time axis, often referred to as temporal scalability. Secondly, there is
scalability on the quality axis, often referred to as signal-to-noise scalability or fine-grain
scalability. The third axis is the resolution axis (number of pixels in image), often referred to
as spatial scalability or layered coding. In layered coding, the bitstream is divided into two or
more bitstreams, or layers. The layers can be combined to form a single high quality signal.
For example, the base layer may provide a lower quality video signal, while the enhancement
layer provides additional information that can enhance the base layer image.

In particular, spatial scalability can provide compatibility between different
video standards or decoder capabilities. With spatial scalability, the base layer video may
have a lower resolution than the input video sequence, in which case the enhancement layer
carries information which can restore the resolution of the base layer to the input sequence
level.

Most video compression standards support spatial scalability. Figure 1
illustrates a block diagram of an encoder 100 which supports MPEG-2/MPEG-4 spatial
scalability. The encoder 100 comprises a base encoder 112 and an enhancement encoder
114. The base encoder is comprised of a low pass filter and downsampler 120, a motion
estimator 122, a motion compensator 124, an orthogonal transform (e.g., Discrete Cosine
Transform (DCT)) circuit 130, a quantizer 132, a variable length coder 134, a bitrate control
circuit 135, an inverse quantizer 138, an inverse transform circuit 140, switches 128, 144, and
an interpolate and upsample circuit 150. The enhancement encoder 114 comprises a motion
estimator 154, a motion compensator 155, a selector 156, an orthogonal transform (e.g.,
Discrete Cosine Transform (DCT)) circuit 158, a quantizer 160, a variable length coder 162,
a bitrate control circuit 164, an inverse quantizer 166, an inverse transform circuit 168, and
switches 170 and 172. The operations of the individual components are well known in the art
and will not be described in detail.

Unfortunately, the coding efficiency of this layered coding scheme is not very
good. Indeed, for a given picture quality, the bitrate of the base layer and the enhancement
layer together for a sequence is greater than the bitrate of the same sequence coded at once.

Figure 2 illustrates another known encoder 200 proposed by DemoGrafx. The
encoder is comprised of substantially the same components as the encoder 100 and the
operation of each is substantially the same, so the individual components will not be
described. In this configuration, the residue difference between the input block and the
upsampled output from the upsampler 150 is inputted into a motion estimator 154. To
guide the motion estimation of the enhancement encoder, the scaled motion vectors
from the base layer are used in the motion estimator 154 as indicated by the dashed line in
Figure 2. However, this arrangement does not significantly overcome the problems of the
arrangement illustrated in Figure 1.

SUMMARY OF THE INVENTION 

It is an object of the invention to overcome at least part of the above-described
deficiencies of the known spatial scalability schemes by providing a spatial scalable
compression scheme which produces a plurality of base streams with differing quality levels
and a plurality of enhancement streams with differing quality levels.

According to one embodiment of the invention, an apparatus for efficiently 
performing spatial scalable compression of an input video stream is disclosed. A base 
encoder encodes a base encoder stream. Modifying means modifies content of the base 
encoder stream to create a plurality of base streams. An enhancement encoder encodes an
enhancement encoder stream. Modifying means modifies content of the enhancement 
encoder stream to create a plurality of enhancement streams. 

According to another embodiment of the invention, a method and apparatus
for providing spatial scalable compression of an input video stream are disclosed. The input
video stream is downsampled to reduce the resolution of the video stream. The
downsampled video stream is encoded to produce a base encoder stream. A plurality of base
streams are created from the base encoder stream. The base encoder stream is decoded and
upconverted to produce a reconstructed video stream. The expected motion between frames
from the input video stream and the reconstructed video stream is estimated, and motion
vectors for each frame of the received streams are calculated based upon an upscaled base
layer plus enhancement layer. The reconstructed video stream is subtracted from the video
stream to produce a residual stream. A predicted stream is calculated using the motion
vectors in a motion compensation unit. The predicted stream is subtracted from the residual
stream. The resulting residual stream is encoded and an enhancement encoder stream is
outputted. A plurality of enhancement streams are created from the enhancement encoder
stream.

According to another embodiment of the invention, a method and apparatus
for decoding a plurality of coded video signals are disclosed. Each of the video streams is
decoded and then the video streams are combined. An inverse quantization operation is
performed on quantization coefficients in the decoded video streams to produce DCT
coefficients. An inverse DCT operation is performed on the DCT coefficients to produce a
first signal. Predicted pictures are produced in a motion compensator and the first signal and
the predicted pictures are combined to produce an output signal.

These and other aspects of the invention will be apparent from and elucidated
with reference to the embodiments described hereafter.



BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described, by way of example, with reference to the
accompanying drawings, wherein:

Figure 1 is a block schematic representation of a known encoder with spatial
scalability;

Figure 2 is a block schematic representation of another known encoder with spatial
scalability;

Figure 3 is a block schematic representation of an encoder with spatial
scalability according to one embodiment of the invention;

Figure 4 illustrates a modifying device with attenuators in series according to
one embodiment of the invention;

Figure 5 illustrates a modifying device with attenuators in cascade according
to one embodiment of the invention; and

Figure 6 illustrates a decoder according to one embodiment of the invention.

DETAILED DESCRIPTION OF THE INVENTION 

Figure 3 is a schematic diagram of an encoder according to one embodiment
of the invention. The depicted encoding system 300 accomplishes layered compression,
whereby a portion of the channel is used for providing a plurality of lower resolution base
layers and the remaining portion is used for transmitting a plurality of enhancement layers,
whereby various base layers and base and enhancement layers can be combined to create
video streams of differing quality levels. It will be understood by those skilled in the art that
other encoding arrangements can also be used to create multilayered base and enhancement
video streams and the invention is not limited thereto.

The encoder 300 comprises a base encoder 312 and an enhancement encoder
314. The base encoder is comprised of a low pass filter and downsampler 320, a motion
estimator 322, a motion compensator 324, an orthogonal transform (e.g., Discrete Cosine
Transform (DCT)) circuit 330, a quantizer 332, a variable length coder (VLC) 334, a bitrate
control circuit 335, an inverse quantizer 338, an inverse transform circuit 340, switches 328,
344, and an interpolate and upsample circuit 350.

An input video block 316 is split by a splitter 318 and sent to both the base
encoder 312 and the enhancement encoder 314. In the base encoder 312, the input block is
inputted into a low pass filter and downsampler 320. The low pass filter reduces the
resolution of the video block which is then fed to the motion estimator 322. The motion
estimator 322 processes picture data of each frame as an I-picture, a P-picture, or a
B-picture. Each of the pictures of the sequentially entered frames is processed as one of the I-,
P-, or B-pictures in a pre-set manner, such as in the sequence of I, B, P, B, P, ..., B, P. That
is, the motion estimator 322 refers to a pre-set reference frame in a series of pictures stored in
a frame memory (not illustrated) and detects the motion vector of a macro-block, that is, a
small block of 16 pixels by 16 lines of the frame being encoded, by pattern matching (block
matching) between the macro-block and the reference frame.

In MPEG, there are four picture prediction modes, that is, intra-coding
(intra-frame coding), forward predictive coding, backward predictive coding, and
bi-directional predictive coding. An I-picture is an intra-coded picture, a P-picture is an
intra-coded or forward predictive coded picture, and a B-picture is an intra-coded, forward
predictive coded, backward predictive coded, or bi-directional predictive coded picture.

The motion estimator 322 performs forward prediction on a P-picture to detect
its motion vector. Additionally, the motion estimator 322 performs forward prediction,
backward prediction, and bi-directional prediction for a B-picture to detect the respective
motion vectors. In a known manner, the motion estimator 322 searches, in the frame
memory, for a block of pixels which most resembles the current input block of pixels.
Various search algorithms are known in the art. They are generally based on evaluating the
mean absolute difference (MAD) or the mean square error (MSE) between the pixels of the
current input block and those of the candidate block. The candidate block having the least
MAD or MSE is then selected to be the motion-compensated prediction block. Its relative
location with respect to the location of the current input block is the motion vector.
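
The block-matching search described above can be sketched as follows. This is only a minimal illustration, assuming an exhaustive search over a square window with MAD as the cost; the function names `mad`, `get_block` and `find_motion_vector` and the `search_range` parameter are hypothetical and are not taken from the described encoder.

```python
def mad(block_a, block_b):
    """Mean absolute difference between two equally sized 2-D blocks."""
    total = sum(abs(a - b)
                for row_a, row_b in zip(block_a, block_b)
                for a, b in zip(row_a, row_b))
    return total / (len(block_a) * len(block_a[0]))


def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def find_motion_vector(current, reference, top, left, size, search_range):
    """Exhaustive search (an assumption; faster searches exist).

    Returns ((dy, dx), cost), where (dy, dx) is the position of the best
    candidate block in the reference frame relative to the current block.
    """
    target = get_block(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best_vector, best_cost = (0, 0), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = mad(target, get_block(reference, y, x, size))
            if cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost
```

Replacing `mad` with a mean square error function would give the MSE variant mentioned in the text without changing the search loop.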

Upon receiving the prediction mode and the motion vector from the motion 
estimator 322, the motion compensator 324 may read out encoded and already locally
decoded picture data stored in the frame memory in accordance with the prediction mode and
the motion vector and may supply the read-out data as a prediction picture to arithmetic unit 
325 and switch 344. The arithmetic unit 325 also receives the input block and calculates the 
difference between the input block and the prediction picture from the motion compensator 
324. The difference value is then supplied to the DCT circuit 330. 

If only the prediction mode is received from the motion estimator 322, that is,
if the prediction mode is the intra-coding mode, the motion compensator 324 may not output
a prediction picture. In such a situation, the arithmetic unit 325 may not perform the above- 
described processing, but instead may directly output the input block to the DCT circuit 330. 

The DCT circuit 330 performs DCT processing on the output signal from the
arithmetic unit 325 so as to obtain DCT coefficients which are supplied to a quantizer 332.
The quantizer 332 sets a quantization step (quantization scale) in accordance with the data
storage quantity in a buffer (not illustrated) received as a feedback and quantizes the DCT
coefficients from the DCT circuit 330 using the quantization step. The quantized DCT
coefficients are supplied to the VLC unit 334 along with the set quantization step.

The VLC unit 334 converts the quantization coefficients supplied from the
quantizer 332 into a variable length code, such as a Huffman code, in accordance with the
quantization step supplied from the quantizer 332. The resulting converted quantization
coefficients are outputted to a buffer (not illustrated). The quantization coefficients and the
quantization step are also supplied to an inverse quantizer 338 which dequantizes the
quantization coefficients in accordance with the quantization step so as to convert the same to
DCT coefficients. The DCT coefficients are supplied to the inverse DCT unit 340 which
performs inverse DCT on the DCT coefficients. The obtained inverse DCT coefficients are
then supplied to the arithmetic unit 348.
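
The round trip through the quantizer 332 and the inverse quantizer 338 can be illustrated with a short sketch. It assumes a plain uniform quantizer with a single quantization step and rounding half away from zero; real MPEG quantizers also apply per-frequency matrices, which are left out here, and the function names are illustrative only.

```python
def quantize(coefficients, step):
    """Map DCT coefficients to integer levels (assumed uniform quantizer)."""
    return [int(abs(c) / step + 0.5) * (1 if c >= 0 else -1)
            for c in coefficients]


def dequantize(levels, step):
    """Inverse quantization: scale the integer levels back by the step."""
    return [level * step for level in levels]


coefficients = [100.0, -37.0, 12.0, 3.0]
levels = quantize(coefficients, step=8)
restored = dequantize(levels, step=8)
# The reconstruction error of a uniform quantizer is at most half a step.
errors = [abs(c - r) for c, r in zip(coefficients, restored)]
```

A larger step yields smaller levels (fewer bits after variable length coding) at the cost of a larger reconstruction error, which is the trade-off the bitrate control feedback adjusts.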

The arithmetic unit 348 receives the inverse DCT coefficients from the inverse
DCT unit 340 and the data from the motion compensator 324 depending on the location of
switch 344. The arithmetic unit 348 sums the signal (prediction residuals) from the inverse
DCT unit 340 with the predicted picture from the motion compensator 324 to locally decode the
original picture. However, if the prediction mode indicates intra-coding, the output of the
inverse DCT unit 340 may be directly fed to the frame memory. The decoded picture
obtained by the arithmetic unit 348 is sent to and stored in the frame memory so as to be used
later as a reference picture for an inter-coded picture, forward predictive coded picture,
backward predictive coded picture, or a bi-directional predictive coded picture.

The quantization coefficients from the quantizer 332 are also applied to a
modifying device 400. The modifying device 400 comprises a plurality of attenuation steps
which can be arranged in series as illustrated in Figure 4 or in cascade or parallel as
illustrated in Figure 5. As illustrated in Figure 4, the quantization coefficients from the
quantizer 332 are applied to an attenuator 401. The signal is then attenuated by the attenuator
401 which results in attenuated DCT coefficients carried by a signal 407. In series with the
attenuator 401, a second attenuator 403 attenuates the amplitude of the DCT coefficients
carried by the signal 407 and delivers new attenuated coefficients carried by signal 413, which
are variable length coded by a variable length coder 422 for generating a first base video
stream BaseBase0.

The attenuators 401 and 403 are composed of an inverse quantizer 402 and
408, respectively, a weighting device 404 and 410, respectively, followed in series by a
quantizer 406 and 412, respectively. The quantization coefficients from the quantizer 332 are
inverse quantized by the inverse quantizer 402. The weighting is performed by multiplying
each DCT block by an 8*8 weighting matrix, each DCT coefficient thus being multiplied by a
weighting factor contained in the matrix and the result of each multiplication being rounded to
the nearest integer. The weighting matrix is filled with values whose amplitudes are between 0
and 1, set for example to non-uniform values close to 1 for low frequencies and close
to 0 for high frequencies, or to uniform values so that all coefficients in the 8*8 DCT
block are equally attenuated. The quantization step consists of dividing the weighted DCT
coefficients by a new quantization factor for delivering quantized DCT coefficients, said
quantization factor being the same for all coefficients of all 8*8 blocks composing a
macroblock.
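
A single attenuator (inverse quantizer, weighting device, quantizer) might be sketched as below. This is an illustration under stated assumptions: the linear fall-off used in `frequency_weights` is a hypothetical choice of non-uniform matrix, the rounding rule is half away from zero, and none of the function names come from the source.

```python
def round_half_away(value):
    """Round to the nearest integer, halves away from zero."""
    return int(abs(value) + 0.5) * (1 if value >= 0 else -1)


def frequency_weights(size=8, low=1.0, high=0.0):
    """Hypothetical non-uniform matrix: `low` at DC, falling linearly
    to `high` at the highest-frequency coefficient."""
    span = 2 * (size - 1)
    return [[low + (high - low) * (u + v) / span for v in range(size)]
            for u in range(size)]


def attenuate(levels, step_in, weights, step_out):
    """Attenuate one block of quantized DCT levels."""
    out = []
    for level_row, weight_row in zip(levels, weights):
        row = []
        for level, weight in zip(level_row, weight_row):
            coefficient = level * step_in                      # inverse quantizer
            weighted = round_half_away(coefficient * weight)   # weighting device
            row.append(round_half_away(weighted / step_out))   # quantizer
        out.append(row)
    return out
```

Passing a matrix filled with one constant (for example 0.5 everywhere) gives the uniform attenuation case, while `frequency_weights()` suppresses high-frequency coefficients more strongly than low-frequency ones.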

The coding error 415 relative to the attenuator 401 is generated by subtracting
signal 407 from a signal from the quantizer 332 by means of a subtraction unit 414. The
coding error 415 is then variable length coded by a variable length coder 416 for generating a
base enhancement video stream BaseEnh2. The coding error 419 relative to the attenuator
403 is generated by subtracting a signal 413 from signal 407 by means of a subtraction unit
418. The coding error 419 is then variable length coded by a variable length encoder 420 for
generating a second base enhancement video stream BaseEnh1.

In this example, the minimum quality base resolution would be provided by
the video stream BaseBase0. A medium quality base resolution would be provided by
combining the video stream BaseBase0 with the video stream BaseEnh1. A high quality base
resolution would be provided by combining the video streams BaseBase0, BaseEnh1 and
BaseEnh2.
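
The series arrangement of Figure 4 can be sketched as follows. The sketch assumes, for simplicity, that every stage shares one quantization factor so that quantized levels can be added and subtracted directly, and it replaces the 8*8 weighting matrix by a single scalar weight; both are simplifications, and the function names are hypothetical. The dictionary keys follow the stream generated from signal 413 and the streams generated from coding errors 419 and 415.

```python
def attenuate(levels, weight):
    """Simplified attenuator: scale each level, round half away from zero."""
    return [int(abs(l * weight) + 0.5) * (1 if l >= 0 else -1) for l in levels]


def series_streams(levels, weight_1, weight_2):
    """Split quantized levels into one base stream and two refinements."""
    signal_407 = attenuate(levels, weight_1)        # attenuator 401
    signal_413 = attenuate(signal_407, weight_2)    # attenuator 403
    error_415 = [a - b for a, b in zip(levels, signal_407)]       # unit 414
    error_419 = [a - b for a, b in zip(signal_407, signal_413)]   # unit 418
    return {"BaseBase0": signal_413,
            "BaseEnh1": error_419,
            "BaseEnh2": error_415}


def combine(*streams):
    """Add decoded streams coefficient by coefficient."""
    return [sum(values) for values in zip(*streams)]


levels = [40, -12, 6, 3, -1, 0]
streams = series_streams(levels, 0.5, 0.5)
medium = combine(streams["BaseBase0"], streams["BaseEnh1"])
high = combine(streams["BaseBase0"], streams["BaseEnh1"], streams["BaseEnh2"])
```

Under these assumptions, adding the stream built from coding error 419 recovers signal 407, and adding the stream built from coding error 415 as well recovers the original quantized levels exactly, which is why each extra stream raises the quality.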

The enhancement encoder 314 comprises a motion estimator 354, a motion
compensator 356, a DCT circuit 368, a quantizer 370, a VLC unit 372, a bitrate controller
374, an inverse quantizer 376, an inverse DCT circuit 378, switches 366 and 382, subtractors
358 and 364, and adders 380 and 388. In addition, the enhancement encoder 314 may also
include DC-offsets 360 and 384, adder 362 and subtractor 386. The operation of many of
these components is similar to the operation of similar components in the base encoder 312
and will not be described in detail.

The output of the arithmetic unit 348 is also supplied to the upsampler 350
which generally reconstructs the filtered out resolution from the decoded video stream and
provides a video data stream having substantially the same resolution as the high-resolution
input. However, because of the filtering and losses resulting from the compression and
decompression, certain errors are present in the reconstructed stream. The errors are
determined in the subtraction unit 358 by subtracting the reconstructed high-resolution
stream from the original, unmodified high resolution stream.
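
The relationship between the downsampled base layer, the upsampled reconstruction and the residual can be illustrated with a one-dimensional toy example. It assumes pairwise averaging as the low pass filter, sample repetition as the upsampler, and an even number of input samples, and it ignores the losses of the base encoder itself; all names are hypothetical.

```python
def downsample(samples):
    """Low-pass by pairwise mean (integer) and keep one value per pair."""
    return [(samples[i] + samples[i + 1]) // 2
            for i in range(0, len(samples) - 1, 2)]


def upsample(samples):
    """Restore the original sample count by repeating each value."""
    return [s for s in samples for _ in range(2)]


def enhancement_residual(original):
    """Return the base layer and the residual left for the enhancement layer."""
    base = downsample(original)
    reconstructed = upsample(base)
    residual = [o - r for o, r in zip(original, reconstructed)]
    return base, residual


original = [10, 12, 20, 26, 7, 7]
base, residual = enhancement_residual(original)
restored = [r + e for r, e in zip(upsample(base), residual)]
```

Here the residual is exactly the detail removed by the filtering, so adding it back to the upsampled base layer restores the input; in the encoder described above the residual additionally contains the compression errors of the base layer.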

According to one embodiment of the invention illustrated in Figure 3, the
original unmodified high-resolution stream is also provided to the motion estimator 354. The
reconstructed high-resolution stream is also provided to an adder 388 which adds the output
from the inverse DCT 378 (possibly modified by the output of the motion compensator 356
depending on the position of the switch 382). The output of the adder 388 is supplied to the
motion estimator 354. As a result, the motion estimation is performed on the upscaled base
layer plus the enhancement layer instead of the residual difference between the original
high-resolution stream and the reconstructed high-resolution stream. This motion estimation
produces motion vectors that track the actual motion better than the vectors produced by the
known systems of Figures 1 and 2. This leads to a perceptually better picture quality,
especially for consumer applications which have lower bit rates than professional
applications.

Furthermore, a DC-offset operation followed by a clipping operation can be
introduced into the enhancement encoder 314, wherein the DC-offset value 360 is added by
adder 362 to the residual signal output from the subtraction unit 358. This optional DC-offset
and clipping operation allows the use of existing standards, e.g., MPEG, for the enhancement
encoder where the pixel values are in a predetermined range, e.g., 0...255. The residual
signal is normally concentrated around zero. By adding a DC-offset value 360, the
concentration of samples can be shifted to the middle of the range, e.g., 128 for 8 bit video
samples. The advantage of this addition is that the standard components of the encoder for
the enhancement layer can be used and result in a cost efficient (re-use of IP blocks) solution.
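
The DC-offset and clipping operation can be sketched in a few lines. The defaults below (offset 128, range 0 to 255) are the 8 bit example values given in the text; the function names are illustrative.

```python
def offset_and_clip(residual, dc_offset=128, low=0, high=255):
    """Shift a zero-centred residual into the pixel range, then clip."""
    return [min(high, max(low, r + dc_offset)) for r in residual]


def remove_offset(samples, dc_offset=128):
    """Undo the shift on the decoder side (clipped samples stay clipped)."""
    return [s - dc_offset for s in samples]


shifted = offset_and_clip([-3, 0, 5, 200, -200])
```

Small residuals survive the round trip unchanged, whereas residuals larger than the range allows are saturated by the clipping and cannot be recovered exactly.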

The various enhancement layer video streams are created in a similar manner
as the creation of the multiple base video streams described above. The quantization
coefficients from the quantizer 370 are also applied to the modifying device 450. The
modifying device 450 may have the same elements as the modifying device 400 illustrated in
Figure 4, and in the following description the same reference numerals will be used for like
elements. The quantization coefficients from the quantizer 370 are applied to the attenuator
401. The signal is then attenuated by the attenuator 401 which results in attenuated DCT
coefficients carried by a signal 407. In series with the attenuator 401, a second attenuator
403 attenuates the amplitude of the DCT coefficients carried by the signal 407 and delivers
new attenuated coefficients carried by signal 413, which are variable length coded by a variable
length coder 422 for generating a first enhancement video stream EnhBase0.

The attenuators 401 and 403 are composed of an inverse quantizer 402 and
408, respectively, a weighting device 404 and 410, respectively, followed in series by a
quantizer 406 and 412, respectively. The weighting is performed by multiplying each DCT
block by an 8*8 weighting matrix, each DCT coefficient thus being multiplied by a weighting
factor contained in the matrix and the result of each multiplication being rounded to the nearest
integer. The weighting matrix is filled with values whose amplitudes are between 0 and 1, set for
example to non-uniform values close to 1 for low frequencies and close to 0 for high
frequencies, or to uniform values so that all coefficients in the 8*8 DCT block are
equally attenuated. The quantization step consists of dividing the weighted DCT coefficients by
a new quantization factor for delivering quantized DCT coefficients, said quantization factor
being the same for all coefficients of all 8*8 blocks composing a macroblock.

The coding error 415 relative to the attenuator 401 is generated by subtracting
signal 407 from a signal from the quantizer 370 by means of a subtraction unit 414. The
coding error 415 is then variable length coded by a variable length coder 416 for generating a
second enhancement video stream EnhEnh2. The coding error 419 relative to the attenuator
403 is generated by subtracting a signal 413 from signal 407 by means of a subtraction unit
418. The coding error 419 is then variable length coded by a variable length encoder 420 for
generating a third enhancement video stream EnhEnh1.

In this example, the minimum quality full resolution would be provided by
adding the video stream EnhBase0 to the high quality base resolution video stream. A
medium quality full resolution would be provided by combining the video streams EnhBase0
and EnhEnh1 with the high quality base resolution. A high quality full resolution would be
provided by combining the video streams EnhBase0, EnhEnh1 and EnhEnh2 with the high
quality base resolution.

Figure 5 illustrates a modifying device wherein the attenuators are connected
in cascade or parallel. It will be understood that the modifying device 500 can be used in
both the base layer and the enhancement layer as a substitute for modifying devices 400 and
450. The quantization coefficients from the quantizer 332 (or quantizer 370) are supplied to
the first attenuator 501. The attenuator 501 comprises an inverse quantizer 502, a weighting
device 504 and a quantizer 506. The quantization coefficients are inverse quantized in the
inverse quantizer 502, then weighted and requantized, as described above with respect to
Figure 4, in the weighting device 504 and the quantizer 506. The attenuated DCT
coefficients carried by a signal 513 are then coded in a variable length coder 514 to produce a
first base (or enhancement) stream.

The coding error 517 of the attenuator 501 is generated by subtracting the
signal 513 from the signal from the quantizer 332 (or quantizer 370) by means of a subtraction
unit 516. The coding error is applied to the second attenuator 503 which is comprised of an
inverse quantizer 508, a weighting device 510 and a quantizer 512. The attenuated signal
519 is encoded by a variable length coder 520 which produces a second base (or
enhancement) stream. The coding error 523 of the attenuator 503 is generated by subtracting
the signal 519 from the signal 517 by means of a subtraction unit 522. The coding error 523
is encoded by a variable length coder 524 which produces a third base (or enhancement) stream.
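
The cascade arrangement of Figure 5 differs from the series arrangement in that the second attenuator works on the coding error of the first one. A minimal sketch, again assuming a shared quantization factor and a scalar weight in place of the 8*8 matrix (both simplifications, with hypothetical function names):

```python
def attenuate(levels, weight):
    """Simplified attenuator: scale each level, round half away from zero."""
    return [int(abs(l * weight) + 0.5) * (1 if l >= 0 else -1) for l in levels]


def cascade_streams(levels, weight_1, weight_2):
    """Return the three streams of the cascade, coarsest first."""
    signal_513 = attenuate(levels, weight_1)                      # attenuator 501
    error_517 = [a - b for a, b in zip(levels, signal_513)]       # unit 516
    signal_519 = attenuate(error_517, weight_2)                   # attenuator 503
    error_523 = [a - b for a, b in zip(error_517, signal_519)]    # unit 522
    return signal_513, signal_519, error_523


levels = [40, -12, 6, 3, -1, 0]
first, second, third = cascade_streams(levels, 0.5, 0.5)
total = [a + b + c for a, b, c in zip(first, second, third)]
```

As in the series case, the three streams add up to the original quantized levels under these assumptions, so a decoder can stop after one, two or three streams depending on the desired quality.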

Figure 6 illustrates a decoder according to one embodiment of the invention
for decoding the multiple base or enhancement streams produced by the modifying devices.
The multiple base (or enhancement) streams are decoded by a plurality of variable length
decoders 602, 604 and 606. The decoded streams are then added together in an arithmetic
unit 608. The decoded quantization coefficients in the combined stream are supplied to an
inverse quantizer 610 which dequantizes the quantization coefficients in accordance with the
quantization step so as to convert the quantization coefficients into DCT coefficients. The
DCT coefficients are supplied to the inverse DCT unit 612 which performs inverse DCT on
the DCT coefficients. The obtained inverse DCT coefficients are then supplied to the
arithmetic unit 614. The arithmetic unit 614 receives the inverse DCT coefficients from the
inverse DCT unit 612 and data (produced in a known manner) from a motion compensator
616. The arithmetic unit 614 sums the stream from the inverse DCT unit 612 with the predicted
picture from the motion compensator 616 to produce the decoded base (or enhancement)
stream. The decoded base and enhancement streams can be combined in a known manner to
create the decoded video output.
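
The decoding path of Figure 6 after variable length decoding can be sketched as follows. The sketch assumes square blocks of quantized levels, a single quantization step, and an orthonormal two-dimensional inverse DCT; the function names and the choice of normalization are assumptions made for illustration.

```python
import math


def inverse_dct_2d(coefficients):
    """Orthonormal 2-D inverse DCT of an n x n block (direct formula)."""
    n = len(coefficients)

    def alpha(k):
        return math.sqrt((1 if k == 0 else 2) / n)

    return [[sum(alpha(u) * alpha(v) * coefficients[u][v]
                 * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                 * math.cos((2 * x + 1) * v * math.pi / (2 * n))
                 for u in range(n) for v in range(n))
             for x in range(n)]
            for y in range(n)]


def decode_block(streams, step, prediction):
    """Combine decoded streams and reconstruct one block of pixels."""
    n = len(prediction)
    # Arithmetic unit 608: add the decoded streams level by level.
    levels = [[sum(s[y][x] for s in streams) for x in range(n)]
              for y in range(n)]
    # Inverse quantizer 610.
    coefficients = [[level * step for level in row] for row in levels]
    # Inverse DCT unit 612.
    residual = inverse_dct_2d(coefficients)
    # Arithmetic unit 614: add the motion-compensated prediction.
    return [[round(residual[y][x] + prediction[y][x]) for x in range(n)]
            for y in range(n)]
```

Passing fewer streams to `decode_block` yields a coarser reconstruction of the same block, which corresponds to selecting a lower quality level at the decoder.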

It will be understood that the different embodiments of the invention are not
limited to the exact order of the above-described steps as the timing of some steps can be
interchanged without affecting the overall operation of the invention. Furthermore, the term
"comprising" does not exclude other elements or steps, the terms "a" and "an" do not exclude
a plurality and a single processor or other unit may fulfill the functions of several of the units
or circuits recited in the claims.

1. An apparatus for efficiently performing spatial scalable compression of an
input video stream, comprising:

a base encoder for encoding a base encoder stream;

means for modifying content of the base encoder stream to create a plurality of
base streams;

an enhancement encoder for encoding an enhancement encoder stream; and

means for modifying content of the enhancement encoder stream to create a
plurality of enhancement streams.

2. The apparatus according to claim 1, wherein said modifying is performed by a
set of attenuation steps applied to coefficients composing said base encoder stream being
assembled in series and a re-encoding step associated to each of said attenuation steps for
delivering one of said plurality of base streams from a coding error by each attenuation step.

3. The apparatus according to claim 1, wherein said modifying is performed by a
set of attenuation steps applied to coefficients composing said enhancement encoder stream
being assembled in series and a re-encoding step associated to each of said attenuation steps
for delivering one of said plurality of enhancement streams from a coding error by each
attenuation step.

4. The apparatus according to claim 1, wherein said modifying is performed by a
set of attenuation steps applied to coefficients composing said base encoder stream being
assembled in cascade and a re-encoding step associated to each of said attenuation steps for
delivering one of said plurality of base streams from a coding error by each attenuation step.

5. The apparatus according to claim 1, wherein said modifying is performed by a
set of attenuation steps applied to coefficients composing said enhancement encoder stream
being assembled in cascade and a re-encoding step associated to each of said attenuation
steps for delivering one of said plurality of enhancement streams from a coding error by each
attenuation step.

6. A layered encoder for encoding an input video stream, comprising:

a downsampling unit for reducing the resolution of the video stream;

a base encoder for encoding a base encoder stream;

means for creating a plurality of base streams by modifying content of the base
encoder stream;

an upconverting unit for decoding and increasing the resolution of the base
encoder stream to produce a reconstructed video stream;

a motion estimation unit which receives the input video stream and the
reconstructed video stream and calculates motion vectors for each frame of the received
streams based upon an upscaled base layer plus enhancement layer;

a first subtraction unit for subtracting the reconstructed video stream from the
input video stream to produce a residual stream;

a motion compensation unit which receives the motion vectors from the
motion estimation unit and produces a predicted stream;

a second subtraction unit for subtracting the predicted stream from the residual
stream;

an enhancement encoder for encoding the resulting stream from the
subtraction unit and outputting an enhancement encoder stream;

means for creating a plurality of enhancement streams by modifying content of
the enhancement encoder stream.

7. The layered encoder according to claim 6, wherein said means for creating a
plurality of base streams comprises:

a set of attenuation means applied to coefficients composing the base encoder
stream, said attenuation means being assembled in series for delivering one of said plurality
of base streams;

re-encoding means associated with each attenuation means for delivering one
of said plurality of base streams, from a coding error generated by each attenuation means.



8. The layered encoder according to claim 6, wherein said means for creating a
plurality of base streams comprises:

a set of attenuation means applied to coefficients composing the base encoder
stream, said attenuation means being assembled in cascade for delivering one of said plurality
of base streams;

re-encoding means associated with each attenuation means for delivering one
of said plurality of base streams, from a coding error generated by each attenuation means.

9. The layered encoder according to claim 7, wherein means for creating a 
plurality of enhancement streams comprises: 

a set of attenuation means applied to coefficients composing the enhancement 
encoder stream, said attenuation means being assembled in series for delivering one of said
plurality of enhancement streams; 

re-encoding means associated with each attenuation means for delivering one 
of said plurality of enhancement streams, from a coding error generated by each attenuation 
means. 

10. The layered encoder according to claim 8, wherein means for creating a 
plurality of enhancement streams comprises: 

a set of attenuation means applied to coefficients composing the enhancement 
encoder stream, said attenuation means being assembled in cascade for delivering one of said 
plurality of enhancement streams;

re-encoding means associated with each attenuation means for delivering one 
of said plurality of enhancement streams, from a coding error generated by each attenuation 
means. 

11. The layered encoder according to claim 7, wherein the attenuation means
comprises frequential weighting means followed in series by quantization means for
quantizing the coefficients, performed at the block level.

12. The layered encoder according to claim 7, wherein each re-encoding means
comprises subtracting means for subtracting an output signal from an input signal of the
associated attenuation means for delivering the coding error, and variable length coding 
means for creating one of said base streams from the coding error. 
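
By way of informal illustration only, and forming no part of the claims, the arrangement of claims 7, 11 and 12 may be sketched in Python as follows. Each attenuation stage applies a frequency-dependent weight and then a uniform quantizer to one block of coefficients, and the coding error of a stage is its input minus its output. The function names, the weights and the quantizer steps are hypothetical. Variable length coding of the error is omitted, and stage outputs are kept in reconstructed form (level times step) so that input and output can be subtracted directly.

```python
from typing import List, Sequence, Tuple

Block = List[float]


def attenuate(block: Sequence[float], weights: Sequence[float], step: float) -> Block:
    """Frequential weighting followed by uniform quantization of one block.

    The result is returned in reconstructed form (level * step) so that it
    can be subtracted directly from the stage input.
    """
    weighted = [c * w for c, w in zip(block, weights)]
    return [round(c / step) * step for c in weighted]


def series_streams(
    block: Sequence[float], weights: Sequence[float], steps: Sequence[float]
) -> Tuple[Block, List[Block]]:
    """Run attenuation stages in series, one stage per quantizer step.

    Returns the output of the last stage together with the coding error of
    every stage (stage input minus stage output). In a full encoder each
    error block would then be passed to a variable length coder.
    """
    errors: List[Block] = []
    current = list(block)
    for step in steps:
        out = attenuate(current, weights, step)
        errors.append([i - o for i, o in zip(current, out)])
        current = out  # the next stage attenuates this output further
    return current, errors
```

Because every error is defined as input minus output, adding the output of the last stage to the errors of all stages recovers the original coefficients in this sketch. A receiver holding only some of the streams obtains a coarser approximation.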

13. The layered encoder according to claim 8, wherein the attenuation means
comprises frequential weighting means followed in series by quantization means for
quantizing the coefficients, performed at the block level.

14. The layered encoder according to claim 13, wherein each re-encoding means
comprises subtracting means for subtracting an output signal from an input signal of the
associated attenuation means for delivering the coding error, and variable length coding 
means for creating one of said base streams from the coding error. 

15. The layered encoder according to claim 9, wherein the attenuation means
comprises frequential weighting means followed in series by quantization means for
quantizing the coefficients, performed at the block level.

16. The layered encoder according to claim 7, wherein each re-encoding means
comprises subtracting means for subtracting an output signal from an input signal of the
associated attenuation means for delivering the coding error, and variable length coding
means for creating one of said enhancement streams from the coding error. 

17. A method for providing spatial scalable compression of an input video stream,
comprising the steps of:

downsampling the input video stream to reduce the resolution of the video
stream;

encoding the downsampled video stream to produce a base encoder stream;

creating a plurality of base streams by modifying content of the base encoder
stream;

decoding and upconverting the base stream to produce a reconstructed video
stream;

estimating the expected motion between frames from the input video stream 
and the reconstructed video stream and calculating motion vectors for each frame of the 
received streams based upon an upscaled base layer plus enhancement layer;

subtracting the reconstructed video stream from the video stream to produce a 
residual stream; 

calculating a predicted stream using the motion vectors in a motion 
compensation unit; 

subtracting the predicted stream from the residual stream;

encoding the resulting residual stream and outputting an enhancement encoder
stream; and

creating a plurality of enhancement streams by modifying content of the
enhancement encoder stream.
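
By way of informal illustration only, and forming no part of the claims, the data flow of the method of claim 17 may be sketched in Python on a one-dimensional frame. Several simplifying assumptions are made: downsampling averages adjacent sample pairs, upconverting repeats each sample, encoding is reduced to uniform quantization, the frame has an even number of samples, and motion estimation is not modelled, so the predicted stream is simply supplied by the caller. The creation of a plurality of base and enhancement streams is likewise left out. All function and parameter names are hypothetical.

```python
from typing import List, Sequence, Tuple


def downsample(frame: Sequence[float]) -> List[float]:
    """Halve the resolution by averaging adjacent sample pairs."""
    return [(frame[i] + frame[i + 1]) / 2 for i in range(0, len(frame) - 1, 2)]


def upconvert(samples: Sequence[float]) -> List[float]:
    """Double the resolution by repeating every sample."""
    return [s for s in samples for _ in range(2)]


def quantize(samples: Sequence[float], step: float) -> List[int]:
    """Stand-in for an encoder: uniform quantization to integer levels."""
    return [round(s / step) for s in samples]


def dequantize(levels: Sequence[int], step: float) -> List[float]:
    """Stand-in for a decoder: scale integer levels back to sample values."""
    return [level * step for level in levels]


def encode_frame(
    frame: Sequence[float],
    predicted: Sequence[float],
    base_step: float,
    enh_step: float,
) -> Tuple[List[int], List[int]]:
    """Return (base levels, enhancement levels) for one frame."""
    # Base layer: downsample, then encode.
    base_levels = quantize(downsample(frame), base_step)
    # Decode and upconvert the base layer to full resolution.
    reconstructed = upconvert(dequantize(base_levels, base_step))
    # Residual: input minus reconstructed base layer.
    residual = [x - r for x, r in zip(frame, reconstructed)]
    # Subtract the (externally supplied) motion-compensated prediction.
    difference = [r - p for r, p in zip(residual, predicted)]
    # Enhancement layer: encode what remains.
    enh_levels = quantize(difference, enh_step)
    return base_levels, enh_levels
```

With an all-zero prediction the enhancement levels carry the whole residual. A prediction close to the residual leaves only small values to encode, which is the purpose of the motion compensation step.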

18. A decoder for decoding a plurality of coded video signals, comprising:

a plurality of decoders, one for each video stream, for decoding said video
streams;

arithmetic unit for combining said decoded video streams;

inverse quantization means for performing an inverse quantization operation
on quantization coefficients in said decoded video streams to produce DCT coefficients;

inverse DCT means for performing an inverse DCT operation on the DCT
coefficients to produce a first signal;

a motion compensation unit for producing predicted pictures;

arithmetic unit for combining the first signal and the predicted pictures to
produce an output signal.

19. The decoder according to claim 18, wherein the plurality of coded video
streams are base streams.

20. The decoder according to claim 18, wherein the plurality of video streams are 
enhancement streams. 

21. A method for decoding a plurality of coded video signals, comprising:

decoding each of said video streams;

combining said decoded video streams;

performing an inverse quantization operation on quantization coefficients in
said decoded video streams to produce DCT coefficients;

performing an inverse DCT operation on the DCT coefficients to produce a
first signal;

producing predicted pictures in a motion compensator;

combining the first signal and the predicted pictures to produce an output
signal.
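
By way of informal illustration only, and forming no part of the claims, the decoding steps of claim 21 may be sketched in Python for a single one-dimensional block. It is assumed that each stream has already been variable-length decoded into a list of integer quantization levels, that combining the streams means adding them coefficient by coefficient, that one uniform quantizer step applies to the whole block, and that the predicted picture is supplied by the caller rather than produced by a real motion compensator. The inverse DCT shown is the orthonormal one-dimensional form, whereas a practical decoder would operate on two-dimensional blocks. All names are hypothetical.

```python
import math
from typing import List, Sequence


def idct(coeffs: Sequence[float]) -> List[float]:
    """Orthonormal one-dimensional inverse DCT (inverse of the DCT-II)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


def decode_block(
    streams: Sequence[Sequence[int]], step: float, predicted: Sequence[float]
) -> List[float]:
    """Combine decoded streams, dequantize, inverse-transform, add prediction."""
    # Combine the decoded streams coefficient by coefficient.
    combined = [sum(levels) for levels in zip(*streams)]
    # Inverse quantization: levels back to DCT coefficients.
    coeffs = [level * step for level in combined]
    # Inverse DCT yields the first signal.
    first_signal = idct(coeffs)
    # Add the predicted picture to obtain the output signal.
    return [s + p for s, p in zip(first_signal, predicted)]
```

In this sketch the streams are combined before inverse quantization, following the order in which the steps are recited, so that only one inverse quantization and one inverse DCT are performed regardless of how many streams are received.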